Language model adaptation for tiny adaptation corpora
Abstract
In this paper we address the problem of building language models for very small training sets by adapting existing corpora. In particular, we investigate methods that combine task-specific unigrams with longer-range models trained on a background corpus. We propose a new method to adapt class models and show how fast marginal adaptation (FMA) can be improved. Instead of estimating the adaptation unigram only on the adaptation corpus, we also study specific methods for adapting unigram models. Extensive experimental studies show the effectiveness of the proposed methods. Compared with FMA as described in [1], we obtain an improvement of nearly 60% for ten utterances of adaptation data.
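To make the idea of combining a task-specific unigram with a background model concrete, here is a minimal sketch of fast marginal adaptation in its commonly cited form, P_adapt(w | h) ∝ P_bg(w | h) · (P_A(w) / P_bg(w))^β. The abstract does not give the exact variant used in the paper, so the additive smoothing, the value of β, the function names, and the toy data below are all assumptions made for illustration only.

```python
from collections import Counter


def unigram(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary.

    The smoothing scheme (add-alpha) is an assumption for this sketch.
    """
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def fma_adapt(p_bg_cond, p_bg_uni, p_ad_uni, beta=0.5):
    """Rescale one background distribution P_bg(. | h) by unigram ratios.

    Each word's probability is multiplied by (P_A(w) / P_bg(w)) ** beta,
    then the result is renormalized so it sums to one for this history.
    """
    scaled = {
        w: p * (p_ad_uni[w] / p_bg_uni[w]) ** beta
        for w, p in p_bg_cond.items()
    }
    z = sum(scaled.values())  # history-dependent normalization constant
    return {w: s / z for w, s in scaled.items()}


# Hypothetical toy data: a background corpus and a tiny adaptation corpus.
vocab = ["flight", "hotel", "the"]
background_tokens = "the hotel the flight the the hotel".split()
adaptation_tokens = "flight flight the".split()

p_bg_uni = unigram(background_tokens, vocab)
p_ad_uni = unigram(adaptation_tokens, vocab)

# Hypothetical background conditional distribution for a single history h.
p_bg_cond = {"flight": 0.2, "hotel": 0.5, "the": 0.3}

adapted = fma_adapt(p_bg_cond, p_bg_uni, p_ad_uni, beta=0.5)
```

In this toy setup, words that are relatively more frequent in the adaptation data than in the background data (here "flight") gain probability mass, while the longer-range structure of the background model is otherwise kept. With only a few adaptation utterances the adaptation unigram is itself poorly estimated, which is the situation the abstract targets by adapting the unigram model as well.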
Similar resources
Integrating MAP and linear transformation for language model adaptation
This paper discusses the integration of various language model (LM) adaptations. Ways of integrating Maximum A Posteriori (MAP) adaptation and linear transformation of bigram probability vectors are introduced and evaluated. This method yields only small improvements for adaptation corpora of fewer than 15,000 words. Another method, based on a data augmentation technique by means of a distance bet...
Persian Adaptation of Enhanced Milieu Teaching for Iranian Children With Expressive Language Delay
Objectives: This study aimed at adapting and examining the applicability of the Teach-Model-Coach-Review model of the enhanced milieu teaching (EMT) approach for improving Iranian mothers’ language strategies while interacting with their toddlers with expressive language delay. Methods: In a single-subject multiple-baseline across-behavior study, the mothers of 3 toddlers with expressive langu...
EFL Classroom Discourse in Iranian Context: Investigating Teacher Talk Adaptation to Students’ Proficiency Level
How language teachers talk is a key factor in organizing and facilitating learning specifically in language classrooms where the medium of instruction is also the subject matter. This study aimed to examine the extent and ways of teacher talk adaptation to students’ proficiency levels in the Iranian EFL context. Two EFL teachers who were teaching three different proficiency levels were observed...
Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions
Appropriate language modeling is one of the major issues for automatic transcription of spontaneous speech. We propose an adaptation method for statistical language models based on both topic and speaker characteristics. This approach is applied for automatic transcription of meetings and panel discussions, in which multiple participants speak on a given topic in their own speaking style. A bas...
Adaptation of Reordering Models for Statistical Machine Translation
Previous research on domain adaptation (DA) for statistical machine translation (SMT) has mainly focused on the translation model (TM) and the language model (LM). To the best of our knowledge, there is no previous work on reordering model (RM) adaptation for phrase-based SMT. In this paper, we demonstrate that mixture model adaptation of a lexicalized RM can significantly improve SMT performanc...